Learning semantic hierarchy with distributed representations for unsupervised spoken language understanding

Authors

  • Yun-Nung Chen
  • William Yang Wang
  • Alexander I. Rudnicky
Abstract

We study the problem of unsupervised ontology learning for semantic understanding in spoken dialogue systems, in particular learning the hierarchical semantic structure from data. Given unlabelled conversations, we augment a frame-semantics-based unsupervised slot induction approach with hierarchical agglomerative clustering to merge topically related slots (e.g., the slots “direction” and “locale” both convey location-related information), building a coherent semantic hierarchy, and then estimate slot importance at different levels of that hierarchy. The high-level semantic estimation involves not only within-slot but also cross-slot relations. The experiments show that high-level semantic information can accurately estimate the prominence of slots, significantly improving slot induction performance; furthermore, a semantic decoder trained on data with automatically extracted slots achieves about 68% F-measure, which is close to that obtained with hand-crafted grammars.
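The slot-merging step described in the abstract can be illustrated with a minimal sketch of hierarchical agglomerative clustering. The abstract does not specify the similarity measure, linkage criterion, or stopping rule, so everything here is an assumption made for illustration: the three-dimensional toy vectors, the slot names other than “direction” and “locale”, cosine similarity, average linkage, and the 0.8 merge threshold. It is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_linkage(c1, c2, vectors):
    """Mean pairwise cosine similarity between the members of two clusters."""
    sims = [cosine(vectors[a], vectors[b]) for a in c1 for b in c2]
    return sum(sims) / len(sims)


def cluster_slots(vectors, threshold):
    """Greedily merge the most similar pair of clusters until the best
    similarity drops below `threshold` (an assumed stopping rule)."""
    clusters = [[name] for name in sorted(vectors)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = average_linkage(clusters[i], clusters[j], vectors)
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        sim, i, j = best
        if sim < threshold:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical slot embeddings; real ones would come from a trained model.
toy_vectors = {
    "direction": [0.9, 0.1, 0.0],
    "locale": [0.8, 0.2, 0.1],
    "expensiveness": [0.0, 0.9, 0.3],
    "food": [0.1, 0.2, 0.9],
}

result = cluster_slots(toy_vectors, threshold=0.8)
print(result)
```

With these toy vectors, only “direction” and “locale” are similar enough to be merged into one higher-level slot; the other two slots remain singletons. Lowering the threshold continues merging upward, which is how a multi-level hierarchy would emerge.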

Similar resources

On the difficulty of a distributional semantics of spoken language

The bulk of research in the area of speech processing concerns itself with supervised approaches to transcribing spoken language into text. In the domain of unsupervised learning most work on speech has focused on discovering relatively low level constructs such as phoneme inventories or word-like units. This is in contrast to research on written language, where there is a large body of work on...

Exploiting Sentence and Context Representations in Deep Neural Models for Spoken Language Understanding

This paper presents a deep learning architecture for the semantic decoder component of a Statistical Spoken Dialogue System. In a slot-filling dialogue, the semantic decoder predicts the dialogue act and a set of slot-value pairs from a set of n-best hypotheses returned by the Automatic Speech Recognition. Most current models for spoken language understanding assume (i) word-aligned semantic an...

Approximate Inference for Domain Detection in Spoken Language Understanding

This paper presents a semi-latent topic model for semantic domain detection in spoken language understanding systems. We use labeled utterance information to capture latent topics, which directly correspond to semantic domains. Additionally, we introduce an 'informative prior' for Bayesian inference that can simultaneously segment utterances of known domains into classes and divide them from ou...

Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...

Publication date: 2015